{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ungraded Lab:  Overfitting in Logistic Regression.\n",
    "\n",
    "The lectures describe **Overfitting**. This is when the model follows the data too closely and does not generalize well. In this lab we will explore overfitting in logistic regression and how regularization can improve situation.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Goals\n",
    "In this lab you will:\n",
    "- use `map_features` to extend the features of a data set\n",
    "- explore the resulting overfitting\n",
    "- utilize regularization to reduce overfitting\n",
    "- reduce features to match the data and reduce overfitting."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Outline\n",
    "- [Tools](#tools)\n",
    "- [Dataset](#dataset)\n",
    "- [Polynomial Feature Map](#FeatureMap)\n",
    "- [Fit the Model](#FitModel)\n",
    "- [Reducing Overfitting](#ReduceOverfitting)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Overfitting\n",
    "In this lab, we will explore how overfitting happens and what can be done about it.\n",
    "- Create a logistic dataset with an irregular boundary\n",
    "- Create an overfitting problem\n",
    "    - polynomial Regression and Feature mapping\n",
    "- Regularization to reduce overfitting\n",
    "<a name='tools'></a>\n",
    "## Tools \n",
    "- We have not yet developed all the capabilities to do gradient decent with regularization so we will utilized sklearn's LogisticRegression capabilities explored briefly in a previous lab. \n",
    "- Plotting is very useful when exploring decision boundaries. We will utilize matplotlib. Producing these plots is quite involved so helper routines are provided below.\n",
    "- We will create a polynomial feature set. `map_features` is provided to simplify that process"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from IPython.display import Markdown as md\n",
    "%matplotlib widget\n",
    "import matplotlib.pyplot as plt\n",
    "plt.style.use('./deeplearning.mplstyle')\n",
    "plt.rcParams['font.size'] = 8\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.linear_model import LinearRegression, Ridge\n",
    "from sklearn import preprocessing\n",
    "from plt_overfit import map_feature, plot_decision_boundary, plt_overfit\n",
    "from lab_utils_common import dlc, plot_data, zscore_normalize_features, gradient_descent, predict_logistic, predict_linear"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def map_one_feature(X1, degree):\n",
    "    \"\"\"\n",
    "    Feature mapping function to polynomial features    \n",
    "    \"\"\"\n",
    "    X1 = np.atleast_1d(X1)\n",
    "    out = []\n",
    "    str = \"\"\n",
    "    k = 0\n",
    "    for i in range(1, degree+1):\n",
    "        out.append((X1**i))\n",
    "        str = str + f\"w_{{{k}}}{munge('x_0',i)} + \"\n",
    "        k += 1\n",
    "    str = str + ' b' #add b to text equation, not to data\n",
    "    return np.stack(out, axis=1), str \n",
    "\n",
    "def munge(base,exp):\n",
    "    if exp == 0:\n",
    "        return ('')\n",
    "    elif exp == 1:\n",
    "        return (base)\n",
    "    else:\n",
    "        return (base + f'^{{{exp}}}')\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "$w_{0}x_0 + w_{1}x_0^{2} + w_{2}x_0^{3} +  b$"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "w_{0}x_0 + w_{1}x_0^{2} + w_{2}x_0^{3} +  b\n",
      "(3, 3)\n",
      "[[ 1.  1.  1.]\n",
      " [ 2.  4.  8.]\n",
      " [ 3.  9. 27.]]\n"
     ]
    }
   ],
   "source": [
    "a = np.array([1.,2,3])\n",
    "a_mapped, a_eq = map_one_feature(a,3)\n",
    "display(md(f\"${a_eq}$\"))\n",
    "print(a_eq)\n",
    "print(a_mapped.shape)\n",
    "print(a_mapped)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name='dataset'></a>\n",
    "##  Dataset\n",
    "Below we create a logistic dataset with two features based on a quadratic. Random noise is added to create a scenario where the model can overfit. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "5b0f0bfde1204e71a7b8379feceb7bfa",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "m = 50\n",
    "n = 2\n",
    "np.random.seed(2)\n",
    "X_train = 2*(np.random.rand(m,n)-[0.5,0.5])\n",
    "y_train = X_train[:,1]+0.5  > X_train[:,0]**2 + 0.5*np.random.rand(m) #quadratic + random\n",
    "y_train = y_train + 0  #convert from boolean to integer\n",
    "\n",
    "fig, ax = plt.subplots(1,1,figsize=(4,4))\n",
    "plot_data(X_train, y_train, ax, s=10, loc='lower right')\n",
    "ax.set_title(\"Logistic data set with noise\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "from matplotlib.gridspec import GridSpec\n",
    "from matplotlib.widgets import Button, CheckButtons\n",
    "from matplotlib.patches import FancyArrowPatch\n",
    "import math\n",
    "# for debug\n",
    "from ipywidgets import Output\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "e7a05c73511d4b50a206cd409cfa0ca4",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "output = Output() # sends hidden error messages to display when using widgets\n",
    "display(output)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "class button_manager:\n",
    "    ''' Handles some missing features of matplotlib check buttons \n",
    "    on init: \n",
    "        creates button, links to button_click routine, \n",
    "        calls call_on_click with active index and firsttime=True\n",
    "    on click:\n",
    "        maintains single button on state, calls call_on_click\n",
    "    '''\n",
    "    @output.capture()  # debug\n",
    "    def __init__(self,fig, dim, labels, init, call_on_click):\n",
    "        ''' \n",
    "        dim: (list)     [leftbottom_x,bottom_y,width,height]\n",
    "        labels: (list)  for example ['1','2','3','4','5','6']\n",
    "        init: (list)    for example [True, False, False, False, False, False]\n",
    "        '''\n",
    "        self.fig = fig\n",
    "        self.ax = plt.axes(dim)  #lx,by,w,h\n",
    "        self.init_state = init    \n",
    "        self.call_on_click = call_on_click\n",
    "        self.button  = CheckButtons(self.ax,labels,init)\n",
    "        self.button.on_clicked(self.button_click)\n",
    "        self.status = self.button.get_status()\n",
    "        self.call_on_click(self.status.index(True),firsttime=True)\n",
    "        \n",
    "    @output.capture()  # debug\n",
    "    def reinit(self):\n",
    "        self.status = self.init_state\n",
    "        self.button.set_active(self.status.index(True))      #turn off old, will trigger update and set to status\n",
    "  \n",
    "    @output.capture()  # debug\n",
    "    def button_click(self, event):\n",
    "        ''' maintains one-on state. If on-button is clicked, will process correctly '''\n",
    "        new_status = self.button.get_status()\n",
    "        new = [self.status[i] ^ new_status[i] for i in range(len(self.status))]\n",
    "        newidx = new.index(True)\n",
    "        self.button.eventson = False\n",
    "        self.button.set_active(self.status.index(True))  #turn off old or reenable if same\n",
    "        self.button.eventson = True\n",
    "        self.status = self.button.get_status()\n",
    "        self.call_on_click(self.status.index(True))\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4c546c1900764122a9282b6b4aadecb2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "6644f7efcb334f708bea8510fec54231",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "class overfit_example():\n",
    "    def __init__(self, X, y, w_in, b_in, regularize=False):\n",
    "        self.X = X\n",
    "        self.y = y\n",
    "        self.w = w_in\n",
    "        self.b = b_in\n",
    "        self.regularize=regularize\n",
    "        self.lambda_=0\n",
    "        fig = plt.figure( figsize=(8,6))\n",
    "        fig.canvas.toolbar_visible = False\n",
    "        fig.canvas.header_visible = False\n",
    "        fig.canvas.footer_visible = False\n",
    "        fig.set_facecolor('#ffffff') #white\n",
    "        gs  = GridSpec(5, 3, figure=fig)\n",
    "        ax0 = fig.add_subplot(gs[0:3, :])\n",
    "        ax1 = fig.add_subplot(gs[-2, :])\n",
    "        ax2 = fig.add_subplot(gs[-1, :])\n",
    "        ax1.set_axis_off()\n",
    "        ax2.set_axis_off()\n",
    "        self.ax = [ax0,ax1,ax2]\n",
    "        self.fig = fig\n",
    "\n",
    "        #pos = ax2.get_position().get_points()  ##[[lb_x,lb_y], [rt_x, rt_y]]\n",
    "        #print(pos)\n",
    "        self.axfitdata = plt.axes([0.26,0.124,0.10,0.1 ])  #lx,by,w,h\n",
    "        self.bfitdata  = Button(self.axfitdata , 'fit data', color=dlc['dlblue'])\n",
    "        self.bfitdata.label.set_fontsize(12)\n",
    "        self.bfitdata.on_clicked(self.fitdata_clicked)\n",
    "\n",
    "        self.cid = fig.canvas.mpl_connect('button_press_event', self.add_data)\n",
    "\n",
    "        self.typebut = button_manager(fig, [0.4, 0.07,0.15,0.15], [\"Regression\", \"Categorical\"],\n",
    "                                       [False,True], self.toggle_type)\n",
    "\n",
    "        self.fig.text(0.1, 0.02+0.21, \"Degree\", fontsize=12)\n",
    "        self.degrbut = button_manager(fig,[0.1,0.02,0.15,0.2 ], ['1','2','3','4','5','6'], \n",
    "                                        [True, False, False, False, False, False], self.update_equation)\n",
    "        if self.regularize:\n",
    "            self.fig.text(0.6, 0.02+0.21, r\"lambda($\\lambda$)\", fontsize=12)\n",
    "            self.lambut = button_manager(fig,[0.6,0.02,0.15,0.2 ], ['0.0','0.2','0.4','0.6','0.8','1'], \n",
    "                                        [True, False, False, False, False, False], self.updt_lambda)\n",
    "   \n",
    "        #self.regbut =  button_manager(fig, [0.8, 0.08,0.24,0.15], [\"Regularize\"],\n",
    "        #                               [False], self.toggle_reg)\n",
    "        #self.logistic_data()\n",
    "    \n",
    "    def updt_lambda(self, idx, firsttime=False):\n",
    "        self.lambda_ = idx * 0.2\n",
    "        \n",
    "    def toggle_type(self, idx, firsttime=False):\n",
    "        self.logistic = True if idx==1 else False\n",
    "        self.ax[0].clear()\n",
    "        if self.logistic:\n",
    "            self.logistic_data()\n",
    "        else:\n",
    "            self.linear_data()\n",
    "        if not firsttime: self.degrbut.reinit()\n",
    "        \n",
    "    def logistic_data(self,redraw=False):\n",
    "        if not redraw:\n",
    "            m = 50\n",
    "            n = 2\n",
    "            np.random.seed(2)\n",
    "            X_train = 2*(np.random.rand(m,n)-[0.5,0.5])\n",
    "            y_train = X_train[:,1]+0.5  > X_train[:,0]**2 + 0.5*np.random.rand(m) #quadratic + random\n",
    "            y_train = y_train + 0  #convert from boolean to integer\n",
    "            self.X = X_train\n",
    "            self.y = y_train \n",
    "\n",
    "        #plot_data(X_train, y_train, self.ax[0], s=10, loc='lower right')\n",
    "        plot_data(self.X, self.y, self.ax[0], s=10, loc='lower right')\n",
    "        self.ax[0].set_title(\"Logistic data set with noise\")\n",
    "        self.ax[0].text(0.5,0.93, \"Click on plot to add data. Hold [Shift] for blue(y=0) data.\",\n",
    "                        fontsize=12, ha='center',transform=self.ax[0].transAxes, color=dlc[\"dlblue\"])\n",
    "        self.ax[0].set_xlabel(r\"$x_0$\") \n",
    "        self.ax[0].set_ylabel(r\"$x_1$\")         \n",
    "    \n",
    "    def linear_data(self,redraw=False):\n",
    "        if not redraw:\n",
    "            m = 30\n",
    "            n = 2\n",
    "            c = 0\n",
    "            x_train = np.arange(0,m,1)\n",
    "            np.random.seed(1)\n",
    "            y_ideal = x_train**2 + c\n",
    "            y_train = y_ideal + 0.7 * y_ideal*(np.random.sample((m,))-0.5)\n",
    "            self.x_ideal = x_train #for redraw when new data included in X\n",
    "            self.X = x_train\n",
    "            self.y = y_train\n",
    "            self.y_ideal = y_ideal\n",
    "        else:\n",
    "            self.ax[0].set_xlim(self.xlim)\n",
    "            self.ax[0].set_ylim(self.ylim)\n",
    "\n",
    "        self.ax[0].scatter(self.X,self.y, label=\"y\")\n",
    "        self.ax[0].plot(self.x_ideal, self.y_ideal, \"--\", color = \"orangered\", label=\"y_ideal\", lw=1)\n",
    "        self.ax[0].set_title(\"OverFitting Example: Linear Data Set (quadratic with noise)\",fontsize = 14)   \n",
    "        self.ax[0].set_xlabel(\"x\"); self.ax[0].set_ylabel(\"y\")\n",
    "        self.ax0ledgend = self.ax[0].legend(loc='lower right')\n",
    "        self.ax[0].text(0.5,0.93, \"Click on plot to add data\",\n",
    "                        fontsize=12, ha='center',transform=self.ax[0].transAxes, color=dlc[\"dlblue\"])\n",
    "        if not redraw:\n",
    "            self.xlim = self.ax[0].get_xlim()\n",
    "            self.ylim = self.ax[0].get_ylim()\n",
    "\n",
    "\n",
    "    @output.capture()  # debug\n",
    "    def add_data(self, event):\n",
    "        if self.logistic:\n",
    "            self.add_data_logistic(event)\n",
    "        else:\n",
    "            self.add_data_linear(event)\n",
    "\n",
    "    @output.capture()  # debug\n",
    "    def add_data_logistic(self, event):\n",
    "        if event.inaxes == self.ax[0]:\n",
    "            x0_coord = event.xdata\n",
    "            x1_coord = event.ydata\n",
    "            \n",
    "            if event.key == None:\n",
    "                self.ax[0].scatter(x0_coord, x1_coord, marker='x', s=10, c = 'red', label=\"y=1\")\n",
    "                self.y = np.append(self.y,1)\n",
    "            else:\n",
    "                self.ax[0].scatter(x0_coord, x1_coord, marker='o', s=10, label=\"y=0\", facecolors='none',\n",
    "                                   edgecolors=dlc['dlblue'],lw=3)\n",
    "                self.y = np.append(self.y,0)\n",
    "            self.X = np.append(self.X,np.array([[x0_coord, x1_coord]]),axis=0)\n",
    "        self.fig.canvas.draw()\n",
    "        \n",
    "    def add_data_linear(self, event):\n",
    "        if event.inaxes == self.ax[0]:\n",
    "            x_coord = event.xdata\n",
    "            y_coord = event.ydata\n",
    "            \n",
    "            self.ax[0].scatter(x_coord, y_coord, marker='o', s=10, facecolors='none',\n",
    "                                   edgecolors=dlc['dlblue'],lw=3)\n",
    "            self.y = np.append(self.y,y_coord)\n",
    "            self.X = np.append(self.X,x_coord)\n",
    "            self.fig.canvas.draw()\n",
    "\n",
    "    @output.capture()  # debug\n",
    "    def fitdata_clicked(self,event):\n",
    "        if self.logistic == True:\n",
    "            self.logistic_regression()\n",
    "        else:\n",
    "            self.linear_regression()\n",
    "        \n",
    "    def linear_regression(self):\n",
    "        self.ax[0].clear()\n",
    "        self.fig.canvas.draw()\n",
    "\n",
    "        # create and fit the model using our mapped_X feature set.\n",
    "        self.X_mapped, _ =  map_one_feature(self.X, self.degree)\n",
    "        self.X_mapped_scaled, self.X_mu, self.X_sigma  = zscore_normalize_features(self.X_mapped)\n",
    "        \n",
    "        #linear_model = LinearRegression()\n",
    "        linear_model = Ridge(alpha=self.lambda_, normalize=True, max_iter=10000)\n",
    "        linear_model.fit(self.X_mapped_scaled, self.y ) \n",
    "        self.w = linear_model.coef_.reshape(-1,)\n",
    "        self.b = linear_model.intercept_\n",
    "        x = np.linspace(*self.xlim,30)  #plot line idependent of data which gets disordered\n",
    "        xm, _ =  map_one_feature(x, self.degree)\n",
    "        xms = (xm - self.X_mu)/ self.X_sigma\n",
    "        y_pred = linear_model.predict(xms)\n",
    "        \n",
    "        #self.fig.canvas.draw()\n",
    "        self.linear_data(redraw=True)\n",
    "        self.ax0yfit = self.ax[0].plot(x, y_pred, color = \"blue\", label=\"y_fit\")\n",
    "        self.ax0ledgend = self.ax[0].legend(loc='lower right')\n",
    "        self.fig.canvas.draw()\n",
    "\n",
    "    def logistic_regression(self):\n",
    "        self.ax[0].clear()\n",
    "        self.fig.canvas.draw()\n",
    "\n",
    "        # create and fit the model using our mapped_X feature set.\n",
    "        self.X_mapped, _ =  map_feature(self.X[:, 0], self.X[:, 1], self.degree)\n",
    "        self.X_mapped_scaled, self.X_mu, self.X_sigma  = zscore_normalize_features(self.X_mapped)\n",
    "        if self.regularize == False or self.lambda_ == 0:\n",
    "            lr = LogisticRegression(penalty='none', max_iter=10000)\n",
    "        else:\n",
    "            C = 1/self.lambda_\n",
    "            lr = LogisticRegression(C=C, max_iter=10000)\n",
    "\n",
    "        lr.fit(self.X_mapped_scaled,self.y)\n",
    "        #print(lr.score(self.X_mapped_scaled, self.y))\n",
    "        self.w = lr.coef_.reshape(-1,)\n",
    "        self.b = lr.intercept_\n",
    "        #print(self.w, self.b)\n",
    "        self.logistic_data(redraw=True)\n",
    "        self.contour = plot_decision_boundary(self.ax[0],[-1,1],[-1,1], self.y, predict_logistic, self.w, self.b, \n",
    "                       scaler=True, mu=self.X_mu, sigma=self.X_sigma, degree=self.degree )\n",
    "        self.fig.canvas.draw()\n",
    "\n",
    "    @output.capture()  # debug\n",
    "    def update_equation(self, idx, firsttime=False):\n",
    "        #print(f\"Update equation, index = {idx}, firsttime={firsttime}\")\n",
    "        self.degree = idx+1\n",
    "        if firsttime:\n",
    "            self.eqtext = []\n",
    "        else:\n",
    "            for artist in self.eqtext:\n",
    "                #print(artist)\n",
    "                artist.remove()\n",
    "            self.eqtext = []\n",
    "        if self.logistic:\n",
    "            _, equation =  map_feature(self.X[:, 0], self.X[:, 1], self.degree)\n",
    "            str = 'f_{wb} = sigmoid(' \n",
    "        else:\n",
    "            _, equation =  map_one_feature(self.X, self.degree)\n",
    "            str = 'f_{wb} = ('\n",
    "        bz = 10\n",
    "        seq = equation.split('+')\n",
    "        blks = math.ceil(len(seq)/bz)\n",
    "        for i in range(blks):\n",
    "            if i == 0:\n",
    "                str = str +  '+'.join(seq[bz*i:bz*i+bz])\n",
    "            else:\n",
    "                str = '+'.join(seq[bz*i:bz*i+bz])\n",
    "            str = str + ')' if i == blks-1 else str + '+'\n",
    "            ei = self.ax[1].text(0.01,(0.75-i*0.25), f\"${str}$\",fontsize=9, transform = self.ax[1].transAxes, ma='left', va='top' )\n",
    "            self.eqtext.append(ei)\n",
    "        self.fig.canvas.draw()\n",
    "\n",
    "plt.close(\"all\")\n",
    "w_in = np.zeros_like(y_train)\n",
    "b_in = 0.\n",
    "ofit = overfit_example(X_train, y_train, w_in, b_in,True)\n",
    "ofit = overfit_example(X_train, y_train, w_in, b_in,False)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "'overfit_example' object has no attribute 'xlim'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-32-3bb47f1efeb7>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinspace\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mofit\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxlim\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m: 'overfit_example' object has no attribute 'xlim'"
     ]
    }
   ],
   "source": [
    "np.linspace(*ofit.xlim,5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class equation_manager:\n",
    "    \n",
    "    @output.capture()  # debug\n",
    "    def __init__(self,ax, logistic=False):\n",
    "        self.ax = ax\n",
    "        self.fig = ax.figure\n",
    "        self.logistic = logistic\n",
    "        self.init_state = [True, False, False, False, False, False]\n",
    "        self.axdegree = plt.axes([0.1,0.02,0.15,0.2 ])  #lx,by,w,h\n",
    "        self.button  = CheckButtons(self.axdegree, ['1','2','3','4','5','6'], self.init_state)\n",
    "        self.button.on_clicked(self.button_clicked)\n",
    "        self.degreetxt = self.fig.text(0.1, 0.02+0.21, \"Degree\", fontsize=12)\n",
    "        self.status = self.button.get_status()\n",
    "        #self.update_equation(self.status.index(True)+1, firsttime=True)\n",
    "        self.button_clicked(None, firsttime=True)\n",
    "        \n",
    "    def reinit(self, logistic):\n",
    "        self.logistic = logistic\n",
    "        self.button.eventson = False\n",
    "        self.button.set_active(self.status.index(True))  #turn off old\n",
    "        self.button.eventson = True\n",
    "        self.button.set_active(self.init_state.index(True))  #turn on init, trigger update            \n",
    "         \n",
    "    @output.capture()  # debug\n",
    "    def button_clicked(self, event, firsttime=False):\n",
    "        ''' firsttime is from __init__, not button push '''\n",
    "        new_status = self.button.get_status()\n",
    "        new = [self.status[i] ^ new_status[i] for i in range(len(self.status))]\n",
    "        newidx = new.index(True)\n",
    "        self.button.eventson = False\n",
    "        self.button.set_active(self.status.index(True))  #turn off old or reenable if same\n",
    "        self.button.eventson = True\n",
    "        self.status = self.button.get_status()\n",
    "        self.update_equation(newidx+1,firstttime)\n",
    "        self.degree = self.status.index(True)+1\n",
    "   \n",
    "    @output.capture()  # debug\n",
    "    def update_equation(self, degree, firsttime=False):\n",
    "        if firsttime:\n",
    "            self.eqtext = []\n",
    "        else:\n",
    "            for artist in self.eqtext:\n",
    "                #print(artist)\n",
    "                artist.remove()\n",
    "            self.eqtext = []\n",
    "\n",
    "        self.X_mapped, equation =  map_feature(X_train[:, 0], X_train[:, 1], degree)\n",
    "        bz = 10\n",
    "        seq = equation.split('+')\n",
    "        blks = math.ceil(len(seq)/bz)\n",
    "        for i in range(blks):\n",
    "            if i == 0:\n",
    "                str = 'f_{wb} = sigmoid('  + '+'.join(seq[bz*i:bz*i+bz])\n",
    "            else:\n",
    "                str = '+'.join(seq[bz*i:bz*i+bz])\n",
    "            str = str + ')' if i == blks-1 else str + '+'\n",
    "            ei = self.ax.text(0.01,(0.75-i*0.25), f\"${str}$\",fontsize=9, transform = self.ax.transAxes, ma='left', va='top' )\n",
    "            self.eqtext.append(ei)\n",
    "        self.fig.canvas.draw()\n",
    "\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name='FeatureMap'></a>\n",
    "##  Create Overfitting...Polynomial Feature Mapping\n",
    "In real data sets, the boundary between \"True\" and \"False\" features is rarely a straight line. To create a non-linear decision boundary, our model will need to support non-linear features. Concretely, if we have two features in our feature set $x_1$ and $x_2$ we can build a model of degree 2:\n",
    "$$f_{\\mathbf{w},b} = w_0x_1 + w_1x_2 + w_2x_1^2 + w_3x_1x_2 + w_4x_2^2 + b \\tag{1} $$\n",
    "To do this, we must convert our two feature data set into a feature set with all combinations of our features. The routine `map_feature` was provided above to do exactly this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Shape before feature mapping: (3, 2)\n",
      "[[2 0]\n",
      " [0 3]\n",
      " [2 3]] \n",
      "\n",
      "Shape after feature mapping: (3, 5)\n",
      "[[2 0 4 0 0]\n",
      " [0 3 0 0 9]\n",
      " [2 3 4 6 9]]\n"
     ]
    }
   ],
   "source": [
    "X_tmp = np.array([[2,0],[0,3],[2,3]] )  # values selected to illustrated equation\n",
    "print(\"Shape before feature mapping:\", X_tmp.shape)\n",
    "print(X_tmp, \"\\n\")\n",
    "\n",
    "mapped_X, descrip =  map_feature(X_tmp[:, 0], X_tmp[:, 1],degree = 2)\n",
    "\n",
    "print(\"Shape after feature mapping:\", mapped_X.shape)\n",
    "print(mapped_X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compare the results with equation (1) above.\n",
    "\n",
    "Of course, we don't have to stop at two. The `degree` argument to map_features will determine the degree of the polynomial that is created. The degree will be determined by the complexity of the curve you are trying to follow. Increasing the degree will allow the model to follow more irregular boundaries, but can also allow for overfitting. The number of features/parameters grows exponentially as all of the cross terms are included. Sklearn [`PolynomialFeatures`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html) can also be used to create feature maps.\n",
    "\n",
    "Lets convert our dataset above to support degree 6."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Original shape of data: (3, 2)\n",
      "w_{0}x_0 + w_{1}x_1 + w_{2}x_0^{2} + w_{3}x_0x_1 + w_{4}x_1^{2} + w_{5}x_0^{3} + w_{6}x_0^{2}x_1 + w_{7}x_0x_1^{2} + w_{8}x_1^{3} + w_{9}x_0^{4} + w_{10}x_0^{3}x_1 + w_{11}x_0^{2}x_1^{2} + w_{12}x_0x_1^{3} + w_{13}x_1^{4} + w_{14}x_0^{5} + w_{15}x_0^{4}x_1 + w_{16}x_0^{3}x_1^{2} + w_{17}x_0^{2}x_1^{3} + w_{18}x_0x_1^{4} + w_{19}x_1^{5} + w_{20}x_0^{6} + w_{21}x_0^{5}x_1 + w_{22}x_0^{4}x_1^{2} + w_{23}x_0^{3}x_1^{3} + w_{24}x_0^{2}x_1^{4} + w_{25}x_0x_1^{5} + w_{26}x_1^{6} +  b\n",
      "Shape after feature mapping: (3, 27)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'w_{0}x_0 + w_{1}x_1 + w_{2}x_0^{2} + w_{3}x_0x_1 + w_{4}x_1^{2} + w_{5}x_0^{3} + w_{6}x_0^{2}x_1 + w_{7}x_0x_1^{2} + w_{8}x_1^{3} + w_{9}x_0^{4} + w_{10}x_0^{3}x_1 + w_{11}x_0^{2}x_1^{2} + w_{12}x_0x_1^{3} + w_{13}x_1^{4} + w_{14}x_0^{5} + w_{15}x_0^{4}x_1 + w_{16}x_0^{3}x_1^{2} + w_{17}x_0^{2}x_1^{3} + w_{18}x_0x_1^{4} + w_{19}x_1^{5} + w_{20}x_0^{6} + w_{21}x_0^{5}x_1 + w_{22}x_0^{4}x_1^{2} + w_{23}x_0^{3}x_1^{3} + w_{24}x_0^{2}x_1^{4} + w_{25}x_0x_1^{5} + w_{26}x_1^{6} +  b'"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train = np.array([[2,0],[0,3],[2,3]] )  # values selected to illustrated equation\n",
    "print(\"Original shape of data:\", X_train.shape)\n",
    "degree = 6\n",
    "X_mapped, equation =  map_feature(X_train[:, 0], X_train[:, 1], degree)\n",
    "print(equation)\n",
    "print(\"Shape after feature mapping:\", X_mapped.shape)\n",
    "foo=md(f\"equation: ${equation}$\")\n",
    "foo2 = f\"{equation}\"\n",
    "foo2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note, with a degree 6 polynomial, we now have 27 features!\n",
    "<a name='FitModel'></a>\n",
    "## Fit the model\n",
    "\n",
    "We are going to use the `LogisticRegression` feature of SkLearn that was introduced in a previous lab. One thing to note, this routine has regularization built in. We will enable and disable that capability to highlight aspects of over fitting. To disable it, the command line argument `penalty` is set to `none`. When enabled, the `C` command line argument controls how much regularization is used. \n",
    "\n",
    "The first step is to scale the data. It turns out, with the quadratic terms, the model won't fit without regularization,which we aren't using in this first experiment, so we will scale the data. This is similar to the feature scaling/mean normalization introduced in the first week."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_mapped_scaled, X_mu, X_sigma  = zscore_normalize_features(X_mapped)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "operands could not be broadcast together with shapes (3,1) (50,1) ",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-15-dd746ad9cbaa>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      4\u001b[0m \u001b[0mnum_iters\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m1000000\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mw_out\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_out\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0m_\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgradient_descent\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_mapped_scaled\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my_train\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_in\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_in\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0malpha\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnum_iters\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlogistic\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      7\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf\"\\nupdated parameters: w:{w_out}, b:{b_out}\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/work/lab_utils_common.py\u001b[0m in \u001b[0;36mgradient_descent\u001b[0;34m(X, y, w_in, b_in, alpha, num_iters, logistic, lambda_, verbose)\u001b[0m\n\u001b[1;32m    286\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    287\u001b[0m         \u001b[0;31m# Calculate the gradient and update the parameters\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 288\u001b[0;31m         \u001b[0mdj_db\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mdj_dw\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcompute_gradient_matrix\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlogistic\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambda_\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    289\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    290\u001b[0m         \u001b[0;31m# Update Parameters using w, b, alpha and gradient\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/work/lab_utils_common.py\u001b[0m in \u001b[0;36mcompute_gradient_matrix\u001b[0;34m(X, y, w, b, logistic, lambda_)\u001b[0m\n\u001b[1;32m    245\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    246\u001b[0m     \u001b[0mf_wb\u001b[0m  \u001b[0;34m=\u001b[0m \u001b[0msigmoid\u001b[0m\u001b[0;34m(\u001b[0m \u001b[0mX\u001b[0m \u001b[0;34m@\u001b[0m \u001b[0mw\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mb\u001b[0m \u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mlogistic\u001b[0m \u001b[0;32melse\u001b[0m  \u001b[0mX\u001b[0m \u001b[0;34m@\u001b[0m \u001b[0mw\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mb\u001b[0m      \u001b[0;31m# (m,n)(n,1) = (m,1)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 247\u001b[0;31m     \u001b[0merr\u001b[0m   \u001b[0;34m=\u001b[0m \u001b[0mf_wb\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0my\u001b[0m                                              \u001b[0;31m# (m,1)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    248\u001b[0m     \u001b[0mdj_dw\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0mm\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m \u001b[0;34m@\u001b[0m \u001b[0merr\u001b[0m\u001b[0;34m)\u001b[0m                                   \u001b[0;31m# (n,m)(m,1) = (n,1)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    249\u001b[0m     \u001b[0mdj_db\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0mm\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msum\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0merr\u001b[0m\u001b[0;34m)\u001b[0m                                   \u001b[0;31m# scalar\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (3,1) (50,1) "
     ]
    }
   ],
   "source": [
    "w_in  = np.zeros_like(X_mapped_scaled[0])\n",
    "b_in  = 0.\n",
    "alpha = 0.01\n",
    "num_iters = 1000000\n",
    "\n",
    "w_out, b_out, _ = gradient_descent(X_mapped_scaled, y_train, w_in, b_in, alpha, num_iters, logistic=True) \n",
    "print(f\"\\nupdated parameters: w:{w_out}, b:{b_out}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'w_out' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-16-070006b76bcf>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      2\u001b[0m \u001b[0;31m#b_out_5M = b_out\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mlab_utils_common\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mcompute_cost_matrix\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msigmoid\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mcompute_cost_matrix\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_mapped_scaled\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my_train\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_out\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_out\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlogistic\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambda_\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m: name 'w_out' is not defined"
     ]
    }
   ],
   "source": [
    "#w_out_5M = w_out\n",
    "#b_out_5M = b_out\n",
    "from lab_utils_common import compute_cost_matrix, sigmoid\n",
    "compute_cost_matrix(X_mapped_scaled, y_train, w_out.reshape(-1,1), b_out, logistic=True, lambda_=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have a trained model, lets map the Original Data (not predicted) along with the decision boundary we derive from the model. Examine `plot_decision_boundary` above to see the details of how this is accomplished."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ccbb65e49afe4cafb22ad2544ddbe5f4",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Canvas(toolbar=Toolbar(toolitems=[('Home', 'Reset original view', 'home', 'home'), ('Back', 'Back to previous …"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "ename": "NameError",
     "evalue": "name 'w_out' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-17-1af3dec91b74>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0;31m#plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict, scaler=scaler )\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      2\u001b[0m \u001b[0mfig\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0max\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msubplots\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfigsize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_out, \n\u001b[0m\u001b[1;32m      4\u001b[0m                        b_out, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )\n\u001b[1;32m      5\u001b[0m \u001b[0mplot_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_train\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my_train\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0max\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ms\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mNameError\u001b[0m: name 'w_out' is not defined"
     ]
    }
   ],
   "source": [
    "#plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict, scaler=scaler )\n",
    "fig,ax = plt.subplots(1,1, figsize=(4,4))\n",
    "plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_out, \n",
    "                       b_out, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )\n",
    "plot_data(X_train,y_train,ax,s=10)\n",
    "ax.set_title(f\"Example of overfitting, \\ndegree {degree}, no regularization\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<details>\n",
    "<summary>\n",
    "    <b>**Expected Output**:</b>\n",
    "</summary>\n",
    "\n",
    "<center> <img  src=\"./images/C1_W3_Lab07_overfitting.PNG\" width=\"440\" height=\"440\"/>   <center/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "Found input variables with inconsistent numbers of samples: [3, 50]",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-18-d2c2fcbf08e5>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0;31m# create and fit the model using our mapped_X feature set.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      2\u001b[0m \u001b[0mlr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mLogisticRegression\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpenalty\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'none'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmax_iter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m10000\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mlr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_mapped_scaled\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my_train\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py\u001b[0m in \u001b[0;36mfit\u001b[0;34m(self, X, y, sample_weight)\u001b[0m\n\u001b[1;32m   1525\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1526\u001b[0m         X, y = check_X_y(X, y, accept_sparse='csr', dtype=_dtype, order=\"C\",\n\u001b[0;32m-> 1527\u001b[0;31m                          accept_large_sparse=solver != 'liblinear')\n\u001b[0m\u001b[1;32m   1528\u001b[0m         \u001b[0mcheck_classification_targets\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1529\u001b[0m         \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mclasses_\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0munique\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36mcheck_X_y\u001b[0;34m(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)\u001b[0m\n\u001b[1;32m    763\u001b[0m         \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mastype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat64\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    764\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 765\u001b[0;31m     \u001b[0mcheck_consistent_length\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    766\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    767\u001b[0m     \u001b[0;32mreturn\u001b[0m \u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36mcheck_consistent_length\u001b[0;34m(*arrays)\u001b[0m\n\u001b[1;32m    210\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0muniques\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    211\u001b[0m         raise ValueError(\"Found input variables with inconsistent numbers of\"\n\u001b[0;32m--> 212\u001b[0;31m                          \" samples: %r\" % [int(l) for l in lengths])\n\u001b[0m\u001b[1;32m    213\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    214\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mValueError\u001b[0m: Found input variables with inconsistent numbers of samples: [3, 50]"
     ]
    }
   ],
   "source": [
    "# create and fit the model using our mapped_X feature set.\n",
    "lr = LogisticRegression(penalty='none', max_iter=10000)\n",
    "lr.fit(X_mapped_scaled,y_train)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "ename": "NotFittedError",
     "evalue": "This LogisticRegression instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNotFittedError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-19-3a6c9c1529d1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mscore\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_mapped_scaled\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my_train\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      2\u001b[0m \u001b[0mw_lr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcoef_\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[0mb_lr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mintercept_\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      4\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mw_lr\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mb_lr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0;31m#plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict, scaler=scaler )\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/base.py\u001b[0m in \u001b[0;36mscore\u001b[0;34m(self, X, y, sample_weight)\u001b[0m\n\u001b[1;32m    367\u001b[0m         \"\"\"\n\u001b[1;32m    368\u001b[0m         \u001b[0;32mfrom\u001b[0m \u001b[0;34m.\u001b[0m\u001b[0mmetrics\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0maccuracy_score\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 369\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0maccuracy_score\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msample_weight\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msample_weight\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    370\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    371\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_base.py\u001b[0m in \u001b[0;36mpredict\u001b[0;34m(self, X)\u001b[0m\n\u001b[1;32m    291\u001b[0m             \u001b[0mPredicted\u001b[0m \u001b[0;32mclass\u001b[0m \u001b[0mlabel\u001b[0m \u001b[0mper\u001b[0m \u001b[0msample\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    292\u001b[0m         \"\"\"\n\u001b[0;32m--> 293\u001b[0;31m         \u001b[0mscores\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdecision_function\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    294\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mscores\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    295\u001b[0m             \u001b[0mindices\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mscores\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mastype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mint\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_base.py\u001b[0m in \u001b[0;36mdecision_function\u001b[0;34m(self, X)\u001b[0m\n\u001b[1;32m    264\u001b[0m             \u001b[0;32mclass\u001b[0m \u001b[0mwould\u001b[0m \u001b[0mbe\u001b[0m \u001b[0mpredicted\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    265\u001b[0m         \"\"\"\n\u001b[0;32m--> 266\u001b[0;31m         \u001b[0mcheck_is_fitted\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    267\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    268\u001b[0m         \u001b[0mX\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcheck_array\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maccept_sparse\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'csr'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py\u001b[0m in \u001b[0;36mcheck_is_fitted\u001b[0;34m(estimator, attributes, msg, all_or_any)\u001b[0m\n\u001b[1;32m    965\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    966\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mattrs\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 967\u001b[0;31m         \u001b[0;32mraise\u001b[0m \u001b[0mNotFittedError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmsg\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0;34m'name'\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mestimator\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__name__\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    968\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    969\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mNotFittedError\u001b[0m: This LogisticRegression instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator."
     ]
    }
   ],
   "source": [
    "print(lr.score(X_mapped_scaled, y_train))\n",
    "w_lr = lr.coef_.reshape(-1,)\n",
    "b_lr = lr.intercept_\n",
    "print(w_lr,b_lr)\n",
    "#plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict, scaler=scaler )\n",
    "fig,ax = plt.subplots(1,1, figsize=(4,4))\n",
    "plot_decision_boundary(ax,[-1,1],[-1,1], y_train, predict_logistic, w_lr, \n",
    "                       b_lr, scaler=True, mu=X_mu, sigma=X_sigma, degree=degree )\n",
    "plot_data(X_train,y_train,ax,s=10)\n",
    "ax.set_title(f\"Example of overfitting, \\ndegree {degree}, no regularization\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'w_lr' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-20-ae290b086af3>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mlab_utils_common\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mcompute_cost_matrix\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msigmoid\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mcompute_cost_matrix\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX_mapped_scaled\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my_train\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw_lr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb_lr\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlogistic\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlambda_\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m: name 'w_lr' is not defined"
     ]
    }
   ],
   "source": [
    "from lab_utils_common import compute_cost_matrix, sigmoid\n",
    "compute_cost_matrix(X_mapped_scaled, y_train, w_lr.reshape(-1,1), b_lr, logistic=True, lambda_=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'f_wb' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-21-a7ea59e1091d>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mf_wb\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m: name 'f_wb' is not defined"
     ]
    }
   ],
   "source": [
    "f_wb.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'w_lr' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-22-ca973be0adcd>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mf_wb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mX_mapped_scaled\u001b[0m \u001b[0;34m@\u001b[0m \u001b[0mw_lr\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mb_lr\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      2\u001b[0m \u001b[0mBlr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my_train\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mf_wb\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlog\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexp\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mf_wb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      4\u001b[0m     \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mf_wb\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m     \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my_train\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mNameError\u001b[0m: name 'w_lr' is not defined"
     ]
    }
   ],
   "source": [
    "f_wb = X_mapped_scaled @ w_lr + b_lr\n",
    "Blr = -(y_train * f_wb) + np.log(1+np.exp(f_wb))\n",
    "for i in range(5,5+6):\n",
    "    print(f_wb[i*6:i*6+6])\n",
    "    print(y_train[i*6:i*6+6])\n",
    "    print(Blr[i*6:i*6+6])\n",
    "    print()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wow, the model has done an amazing job of separating the data! However, that is probably not what is desired. \n",
    "We can take two approaches to reducing overfitting:\n",
    "- regularization \n",
    "- reduce the degree of the polynomial."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name='ReduceOverfitting'></a>\n",
    "## Reducing Overfitting using regularization\n",
    "The next labs will cover regularization in more detail, so we will just explore this briefly.\n",
    "Lets fit the model again, but this time include regularization. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create and fit the model using our mapped_X feature set.\n",
    "lr = LogisticRegression(max_iter=1000, C=1)\n",
    "lr.fit(mapped_X,y_train)\n",
    "\n",
    "# print an evaluation of the fit, 1 is best.\n",
    "print(\"fitting score:\",lr.score(mapped_X, y_train))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict)\n",
    "plot_data(X_train,y_train)\n",
    "plt.title(\"Example of overfitting, degree 6, with regularization, C=1\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The decision boundary is much more reasonable with some regularizationg.\n",
    "Change the value of `C` above to try more or less regularization. C must be strictly positive. Values less than 1 maximumize regularization while large values minimize regularization."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Reduce the degree of the polynomial\n",
    "A degree 6 polynomial may be more than is required! We can reduce the values to limit the model.\n",
    "To do this, we will need to regenerate our mapped data and refit the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Original shape of data:\", X_train.shape)\n",
    "degree = 2\n",
    "mapped_X =  map_feature(X_train[:, 0], X_train[:, 1],degree)\n",
    "\n",
    "print(\"Shape after feature mapping:\", mapped_X.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create and fit the model using our mapped_X feature set.\n",
    "lr = LogisticRegression(penalty='none', max_iter=1000, C=1)\n",
    "lr.fit(mapped_X,y_train)\n",
    "\n",
    "# print an evaluation of the fit, 1 is best.\n",
    "print(\"fit score:\", lr.score(mapped_X, y_train))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_decision_boundary([-1,1],[-1,1], y_train,lr.predict)\n",
    "plot_data(X_train,y_train)\n",
    "plt.title(\"Example of overfitting, degree 2, with no regularization\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not bad! Of course, in this case, we knew ahead of time the data was quadratic and that a degree two polynomial would be a good choice. Try varying `degree` above to see the impact of polynomial degree on overfitting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
